labs: entire investigate — multi-agent investigation loop by alishakawaguchi · Pull Request #1231 · entireio/cli

alishakawaguchi · 2026-05-19T23:16:32Z

https://entire.io/gh/entireio/cli/trails/397

Summary

- Adds `entire investigate` (labs / hidden): a round-robin multi-agent investigation loop that drives claude-code, codex, and gemini-cli through turns, each appending findings, evidence, and stances to a shared findings doc until the run reaches quorum, stalls, or is paused or cancelled.
- Subcommands: `fix [run-id]`, `show [run-id]`, `clean [run-id|--all]`.
- Inputs: a positional `[seed-doc]`, `--issue-link <url>` (gh-resolved, with userinfo redaction and an untrusted-content envelope), or, when the spawn-time picker fires, an in-picker "Investigation prompt".
- Lifecycle: a per-turn `pending_turn` contract in state.json, an env-var provenance handshake (`ENTIRE_INVESTIGATE_*`) adopted by the UserPromptSubmit hook, condensation into checkpoint metadata on commit, and a HasInvestigation umbrella flag on the checkpoint summary, surfaced via `entire status` and the soft-warn guard.
- Resume: `--continue <run-id>` reloads RunState and rewrites the manifest with the new terminal outcome, so stale "paused" records are no longer left behind.
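
As a rough illustration of the round-robin shape described above, here is a minimal Go sketch. It is not the code in this PR: the `Turn` callback, the `runLoop` name, and in particular the quorum rule (every agent approves within the same round) and the stall rule (round budget exhausted) are assumptions made for the example.

```go
package main

import "fmt"

// Outcome is the terminal state of a run (illustrative names only).
type Outcome string

const (
	Quorum  Outcome = "quorum"
	Stalled Outcome = "stalled"
)

// Turn is a hypothetical stand-in for spawning one agent turn. It reports
// whether the agent approved the current findings doc.
type Turn func(agent string, round int) (approved bool)

// runLoop visits agents in a fixed order, one turn each per round.
// ASSUMPTION: quorum means every agent approved within the same round, and
// the run stalls once maxRounds passes complete without quorum.
func runLoop(agents []string, maxRounds int, turn Turn) (Outcome, int) {
	if len(agents) == 0 {
		return Stalled, 0
	}
	for round := 1; round <= maxRounds; round++ {
		approvals := 0
		for _, agent := range agents {
			if turn(agent, round) {
				approvals++
			}
		}
		if approvals == len(agents) {
			return Quorum, round
		}
	}
	return Stalled, maxRounds
}

func main() {
	agents := []string{"claude-code", "codex", "gemini-cli"}
	// Fake turn function: every agent approves from round 2 onward.
	outcome, rounds := runLoop(agents, 5, func(_ string, round int) bool {
		return round >= 2
	})
	fmt.Println(outcome, rounds) // prints "quorum 2"
}
```

Pause and cancel are left out of the sketch; in a real loop they would be additional exits driven by signals or consecutive failures.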

Supporting refactors (review/investigate boundary)

New leaf packages, each consumed by both commands so the duplication is gone:
- `cmd/entire/cli/tuiutil/` — width-aware text helpers + `FormatDuration`
- `cmd/entire/cli/gitexec/` — `git` CLI runner (separates stdout from stderr-as-error-context)
- `cmd/entire/cli/uiform/` — accessible huh form constructor + `PromptYN`
- `cmd/entire/cli/provenance/` — single source of truth for ENTIRE_REVIEW_* / ENTIRE_INVESTIGATE_* env contract
- `lifecycle.go`: `adoptReviewEnv` + `adoptInvestigateEnv` share `tryAdoptEnv(spec)`.
- `investigate/show.go` and `clean.go` share `ResolveByRunID` for exact-then-prefix resolution.
- `investigate` uses `jsonutil.WriteFileAtomic` / `checkpoint/id.Generate` / agent name constants instead of re-implementing.

Security hardening (from adversarial review)

- `runGhExec` redacts URL userinfo (`https://user:TOKEN@github.com/...`) before any arg reaches an error string. Earlier credential redaction only covered the seed-doc / log paths.
- `--issue-link` now requires an interactive y/N before launching agents with permission/sandbox bypass; non-interactive callers see the warning on stderr and proceed (CI / scripted use is not hard-blocked).
- Issue body + comments are wrapped in a `<untrusted source="...">` envelope, and any literal closing tag inside the body is defanged with a zero-width space.

Test plan

`mise run fmt && mise run lint` clean
`go test ./cmd/entire/cli/...` all green
Manual smoke: `entire investigate <seed.md>` (single agent, multi-agent picker)
Manual smoke: `entire investigate --issue-link <url>` (accept + decline branches)
Manual smoke: pause (Ctrl+C) → `entire investigate --continue <run-id>` → reaches quorum → manifest rewritten with new outcome
Manual smoke: `entire investigate fix <run-id>`, `show <run-id>`, `clean <run-id>` / `clean --all`

🤖 Generated with Claude Code

Note

High Risk
High risk due to introducing a new command that spawns external agent processes with sandbox/permission-bypass flags, adds deletion functionality (investigate clean), and extends checkpoint metadata/wire formats (HasInvestigation + investigate fields) that affect on-disk persisted data.

Overview
Adds a hidden labs entire investigate command that runs a round-robin multi-agent loop (with resume via --continue), bootstraps a per-run findings doc (from seed doc, issue link, or picker prompt), persists run state/manifests, and provides fix, show, and clean subcommands (including confirmed deletion of saved investigation artifacts).

Introduces a shared spawn.Spawner interface plus concrete spawners for claude-code, codex, and gemini-cli (using non-interactive invocation and permission/sandbox-bypass flags), and adds agentlaunch.LaunchFixAgent to start follow-up “fix” sessions while stripping review/investigate provenance env.
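
Stripping provenance env before a fix session could look roughly like the sketch below. The function name `stripProvenance` is hypothetical and `ENTIRE_REVIEW_EXAMPLE` is a made-up variable name; the two prefixes come from the env contract named in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// stripProvenance returns a copy of env without any ENTIRE_REVIEW_* or
// ENTIRE_INVESTIGATE_* entries, so a follow-up session is not tagged as part
// of the run that produced the findings. (Hypothetical helper.)
func stripProvenance(env []string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		if strings.HasPrefix(kv, "ENTIRE_REVIEW_") || strings.HasPrefix(kv, "ENTIRE_INVESTIGATE_") {
			continue
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"ENTIRE_INVESTIGATE_STATE_DOC=/tmp/state.json",
		"ENTIRE_REVIEW_EXAMPLE=1", // made-up name, for illustration only
	}
	fmt.Println(stripProvenance(env)) // prints "[PATH=/usr/bin]"
}
```

In practice the filtered slice would be assigned to `exec.Cmd.Env` before starting the agent process.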

Extends checkpoint/session metadata to record investigation provenance (investigate_* fields) and a HasInvestigation umbrella flag, propagates it through v2 checkpoint summary merge logic and entire explain --json export, and adds tests (unit + integration) pinning the new on-disk/JSON wire formats and adoption via ENTIRE_INVESTIGATE_* env vars.

Reviewed by Cursor Bugbot for commit f8edb81.

Ships `entire investigate` as a hidden top-level command surfaced through `entire labs`. Runs a non-TUI marvin-style round-robin loop across launchable, hook-enabled agents (claude-code, codex, gemini-cli), bootstraps a findings doc + timeline, and tags each spawned session with a new agent_investigate Kind so the next commit condenses the investigation onto entire/checkpoints/v1 alongside review runs. GitHub issue/PR URLs seed the loop through gh; a local manifest plus an `entire investigate fix` helper feeds accepted findings into a follow-up coding session. Investigate gets its own HasInvestigation umbrella flag and Kind.IsInvestigate() predicate rather than reusing review's umbrella, so the feature stays semantically distinct. Includes a small spawner refactor: extract a shared Spawner interface under cmd/entire/cli/agent/spawn/ so review and investigate share the per-agent argv builders. Review argv is byte-identical post-refactor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 5d9432613a5d

- runContinue now reloads settings.Investigate.AlwaysPrompt so --continue preserves the user's configured preamble. Previously a Ctrl+C plus resume silently dropped it.
- ParseStanceFromTimeline's "headingFound" return is now consumed: an agent that exits 0 but writes no turn heading counts as a soft failure, so two consecutive missing-heading turns trip pause-on-failure rather than burning the budget silently.
- adoptInvestigateEnv validates EnvRunID via investigate.IsValidRunID before tagging state. An empty or non-12-hex run ID is rejected with a logged warning so junk run IDs cannot leak into checkpoint metadata.
- ResolveIssueLink uses url.Redacted() for log/seed-doc URLs so a basic-auth credential embedded in --issue-link never reaches stderr, logs, or the findings doc.
- Banner prints topic via %q to neutralize ANSI escapes.
- openTurnLog documents that concurrent --continue on the same run id is not supported (single-shell continue only).

Tests added for each fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6fce1174a808
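
A run-ID check of the kind described (reject empty or non-12-hex values) might be written as follows. This is a sketch: `isValidRunID` is a lowercase stand-in for the PR's `investigate.IsValidRunID`, and restricting to lowercase hex is an assumption based on the IDs shown on this page.

```go
package main

import (
	"fmt"
	"regexp"
)

// runIDPattern accepts exactly 12 lowercase hex characters.
// ASSUMPTION: uppercase hex is rejected; the source only says "12-hex".
var runIDPattern = regexp.MustCompile(`^[0-9a-f]{12}$`)

// isValidRunID reports whether id is safe to record in checkpoint metadata.
func isValidRunID(id string) bool {
	return runIDPattern.MatchString(id)
}

func main() {
	for _, id := range []string{"5d9432613a5d", "", "not-a-run-id", "5d9432613a5d00"} {
		fmt.Printf("%q valid=%t\n", id, isValidRunID(id))
	}
}
```

Validating at the adoption point means a stray or hostile environment variable is dropped before it can be persisted.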

1. Prompt injection via --issue-link: wrap untrusted issue/PR body and each comment in a labeled <untrusted source="..."> envelope so a well-aligned agent treats the content as data. Defang any literal close-tag inside the body so the envelope is not breakable. Add a regression test that pins both the wrapper and the defang.
2. Resume crash on shrunken --agents: refuse rather than panic when persisted NextAgentIdx exceeds the (possibly overridden) agent list. Surface an actionable error pointing at the state file.
3. Per-turn log unbounded growth: wrap the log file in a 16 MiB-capped writer that drops the tail and emits a single truncation marker. The capped writer reports len(p) so exec.Cmd never sees a short-write teardown signal. Verbose tee output remains uncapped (terminal flow control bounds it).
4. Settings load failure on --continue silently dropping AlwaysPrompt: surface a visible warning when the settings file is broken on resume, so the user notices their preamble has disappeared instead of seeing unexplained agent behaviour change.
5. RunState.Round vs TurnStance.Round semantic conflict: rename RunState.Round to CompletedRounds (0-indexed completed-pass count) so it is clearly distinct from TurnStance.Round (1-indexed round-this-turn-belongs-to). Document both fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 3d52e957d3cc

The investigate loop now drives a Bubble Tea dashboard modelled on `entire review`, with one row per agent showing AGENT / STATUS / DURATION / TURN / APPROVED. Progress is delivered through a small `ProgressSink` interface so the loop itself stays free of any TUI dependency: TTY runs get the dashboard, while non-TTY runs (CI, redirected stdout, agent-host invocations) get the same two-line shape today's log output produces.

The `--verbose` flag is removed. Per-turn agent stdout still lands on disk at `<git-common-dir>/entire-investigations/transcripts/<run-id>/turn-N-<agent>.log`, which is the canonical place to inspect raw output post-hoc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: df0fb75a3046

Mirrors the review-side change to keep agent picker output out of the committed project settings. Reads/writes only the local file so other settings fields are preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: cd436b111b04

Reflects the storage change in fix(investigate): store config in .entire/settings.local.json. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 4e825ed88af2

Prompts the user on the next `entire investigate` invocation to move the investigate config out of committed project settings and into .entire/settings.local.json. Non-interactive runs print guidance and continue. Existing local investigate config is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 8349c994a7eb

The migration runs before any --edit / --findings / --continue dispatch so users see the move-to-local prompt on the next invocation regardless of which subcommand they reach for. Also harden loadProjectInvestigateSettings to fail open on malformed project settings JSON: the user already sees a parse error from settings.Load downstream, and blocking the migration prompt on bad JSON would make `entire investigate` unusable in the exact situation it exists to help recover from. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: ceee2ec8b984

Symmetric to TestHeadHasReviewCheckpoint_WrapperPreservesContract: when the checkpoint at HEAD has HasReview=true but HasInvestigation=false, headHasInvestigateCheckpoint must return false. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6cf6b7a5a660

Reads CheckpointSummary.HasInvestigation at HEAD via the existing headHasInvestigateCheckpoint wrapper and prompts the user (default Yes) before launching another run. Skipped for --edit and --findings modes and for non-interactive callers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: bda6b287cd53

Returns the user's per-run agent selection plus an optional per-run prompt textarea. Mirrors review/multipicker.go. Wiring into cmd.go follows in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6e239f58bc02

When 2+ agents are configured and --agents is not set, prompt the user for a per-run agent subset and an optional preamble. The preamble is joined onto AlwaysPrompt for this run only; settings are not modified. Mirrors review's multipicker UX. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 5d5bb9154962

Adds a per-agent buffer of timelineEntry rows fed by turnStartedMsg and turnFinishedMsg. Lays the groundwork for the Ctrl+O drill-in view. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 69124ca4e131

Extends ProgressSink.TurnFinished with a preview string parsed as the first non-empty non-stance line of the turn block. Consumed by the TUI sink and surfaced in the upcoming drill-in detail view. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 29a8946fad70

Renders an agent's timeline buffer for the drill-in view. Output is padded to exactly termHeight lines to avoid Bubble Tea alt-screen ghost rows. Wiring into Update/View follows in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 4bbb6700b1ba

Adds detail-mode state to the TUI model and routes keyboard + mouse wheel events to scroll the per-agent timeline buffer. Esc returns to the dashboard; ←/→ cycle agents; ↑/↓ scroll. Detail mode toggles AltScreen and MouseMode on the rendered tea.View — mirrors the review package's pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: a4d580798d15

- picker.go:183: corrected post-save line to point at .entire/settings.local.json
- cmd.go runContinue: wire spawn-time multipicker on resume when persisted state has 2+ agents; thread perRun into AlwaysPrompt via composeAlwaysPrompt
- cmd.go soft-warn: emit logging.Info "running anyway (non-interactive)" when the user can't be prompted
- multipicker.go: extract sortAgentChoices helper for testability
- tests: add 3 missing spec-mandated tests: ResultSortedAlphabetically, PerRunPromptOptional, SoftWarnSilentInNonInteractive

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: d9fc3447ba6e

runInvestigate never called logging.Init, so every `logging.Info(...)` inside the loop hit slog.Default(), which writes a plain-text line to stderr. During a TUI run those stderr lines interleave with the dashboard redraw and produce visible garbage like

    2026/05/12 15:52:00 INFO investigate: turn end ... turn=⡿

(the ⡿ is a spinner frame from the alt-screen leaking into the log line).

Call logging.Init at the top of runInvestigate and defer Close, mirroring every other long-running command (attach, migrate, resume, hooks_git_cmd). Logs now land in .entire/logs/entire.log only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 2d9dc05d866b

Claude Code's `-p` (non-interactive) mode silently denies Write/Edit tool calls when no permission flag is set — there's no UI to answer the permission prompt. This made `entire investigate` unusable with claude-code: every turn the agent did its analysis but couldn't write the `## Turn N — claude-code` heading to the timeline doc, so the loop counted the turn as a soft failure and marked the agent ✗ failed. Add --permission-mode acceptEdits to the shared spawner argv. The flag auto-accepts edits to files the agent decides to write; it does not itself trigger writes. Review doesn't write files in practice, so the flag is a no-op for the review path — only investigate's behaviour changes. Tests update the pinned argv contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: c694e8346827

The previous stub captured $2, which used to be the prompt under `claude -p <prompt>`. After 98df345 added --permission-mode acceptEdits, the argv shape became `claude -p --permission-mode acceptEdits <prompt>`, so $2 captured the flag name. Iterate to grab the last positional so the stub survives future argv shifts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 089d400d1133

The directive was stripped by gofmt during the Tier 2 work — it sat on its own comment line above the function and got separated from the signature by a blank line, making golangci-lint disregard it. Restore the directive directly above 'func' so lint:go is clean again. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 5ca4375d0ebf

The three lines

    Investigating: "<topic>" (run <id>)
    Findings: <path>
    Timeline: <path>

were left visible above the live dashboard in TTY mode and duplicated the TUI's own title row. The dashboard then redrew below them, producing the noisy "stale rows above the live screen" effect the user reported. Skip the banner when the TUI will render. Non-TTY mode keeps the banner since the text sink doesn't surface those paths. Same change applies to runContinue's "Resuming investigation:" line.

Also add the investigate/cmd.go path to the .golangci.yaml ireturn exclusion list: golangci-lint --fix kept stripping the nolint directive on buildProgressSink during fmt, and a per-path exclusion mirrors the existing review/tui_model.go entry which has the same abstract-sink-plus-concrete-handle return contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: f1a08334c505

single-select picker on --findings

Two UX regressions from the Tier 2 work:

1. --continue resumes a paused run. The persisted state already captures the user's agent selection from the original run, so reopening the multipicker on every resume is friction. Worse, if the user accidentally deselects the agent whose turn is next, the resume refuses to proceed (NextAgentIdx exceeds list length). Trust the persisted state; --agents <csv> still narrows on resume for the rare case it's needed.
2. --findings reaches for the picker in TTY mode and shows just the chosen manifest's detail. Users who type --findings want to see all runs, not pick one; the `fix:` hint on each row gives them the next step. Always print the full list.

Remove the now-dead promptForInvestigateManifest and printInvestigateManifestDetail helpers and their no-longer-needed imports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: a0e16fb297cf

Each investigation turn already runs as a tagged agent session (Kind=agent_investigate) whose full transcript is condensed onto entire/checkpoints/v1 on commit — same machinery as review. The .log files written to <git-common-dir>/entire-investigations/transcripts/ duplicate that. Drop the capture entirely; route agent stdout/stderr to io.Discard. Removes ~150 lines of bounded-writer/log-rotation code and the LoopDeps.TranscriptDir field. Existing on-disk .log dirs aren't cleaned up by this change — they're harmless and the user can rm them when convenient (or via a future `entire investigate rm` subcommand). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 8f3b007365b7

Two investigations on the same topic used to write to the same .entire/investigations/<slug>.md + <slug>-timeline.md, stomping each other's findings. New layout puts each run under its own subdir:

    .entire/investigations/<run-id>-<slug>/findings.md
    .entire/investigations/<run-id>-<slug>/timeline.md

--output still overrides verbatim. Legacy on-disk investigations keep working because state + manifest persist the file paths directly; only new runs use the new layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 7776d7ccdfc7

Per-run artefacts now live at:

    <git-common-dir>/entire-investigations/<run-id>/
      findings.md   <- the collaborative findings doc
      state.json    <- cursors + stance history + pending_turn

timeline.md is gone. The agent reports its stance by setting a `pending_turn` field in state.json after editing findings.md; the loop reads it after the agent exits, appends to stances[], and clears the field.

ParseStanceFromTimeline, findTurnBlock, normaliseStance, and the turn-block regex globals are removed. Bootstrap no longer creates a timeline file. The --output flag is removed (escape hatch can be reintroduced if needed). The ENTIRE_INVESTIGATE_TIMELINE_DOC env var is removed; ENTIRE_INVESTIGATE_STATE_DOC is added so the agent can locate state.json.

The TUI "findings" preview that D2 introduced now feeds the agent's pending_turn note straight through (parseFindingsPreview / readTimelineFile / findTurnBlock are all gone).

State files move from <git-common-dir>/entire-investigations/state/<run-id>.json to <git-common-dir>/entire-investigations/<run-id>/state.json, alongside findings.md. The List path walks subdirs instead of *.json. Old state files (and the per-run dir under .entire/investigations/) become orphaned; this is accepted in the redesign.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 9c6e606a9cf6
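
The `pending_turn` handshake described above (agent sets the field, loop appends it to `stances[]` and clears it) can be sketched like this. Only the `pending_turn` and `stances` names come from the commit message; the struct shapes, the other JSON keys, and `consumePendingTurn` are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TurnStance is one agent's reported stance (field names are assumed).
type TurnStance struct {
	Agent  string `json:"agent"`
	Stance string `json:"stance"`
	Note   string `json:"note,omitempty"`
}

// RunState models only the two fields this sketch needs.
type RunState struct {
	Stances     []TurnStance `json:"stances"`
	PendingTurn *TurnStance  `json:"pending_turn,omitempty"`
}

// consumePendingTurn moves pending_turn onto stances and clears the field.
// It reports false when the agent exited without setting pending_turn, which
// a loop could treat as a soft failure.
func consumePendingTurn(raw []byte) ([]byte, bool, error) {
	var st RunState
	if err := json.Unmarshal(raw, &st); err != nil {
		return nil, false, err
	}
	if st.PendingTurn == nil {
		return raw, false, nil
	}
	st.Stances = append(st.Stances, *st.PendingTurn)
	st.PendingTurn = nil
	out, err := json.Marshal(st)
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

func main() {
	raw := []byte(`{"stances":[],"pending_turn":{"agent":"codex","stance":"approve","note":"evidence checks out"}}`)
	out, ok, err := consumePendingTurn(raw)
	fmt.Println(string(out), ok, err)
}
```

A real state file carries more fields than this struct models; round-tripping through a partial struct would drop them, so production code would need the full type or a raw-message merge.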

The findings doc is now a single converged answer the agents edit in place each turn, not a chronological log of attempts. New structure:

    ## Current understanding    <- the team's best answer right now
    ## Supporting evidence      <- claims tied to concrete refs
    ## Disputed / unverified    <- what isn't yet confirmed

No more numbered findings, no per-turn attribution in the doc, no Approach / Conclusion / Recommendations sections. Provenance for "who changed what" lives in the agent session transcripts on entire/checkpoints/v1 (recoverable via `entire checkpoint explain`), not in the doc itself.

The agent prompt is rewritten to direct each agent to read, verify, and edit -- not append. Stance still reports via state.json's pending_turn field (unchanged from R1). Testdata prompt snapshots regenerated; bootstrap scaffold updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 8465aba45120

On terminal outcomes (Quorum/Stalled) the loop now reads the final findings.md content into the manifest's new findings_content field, then RemoveAlls the per-run directory <git-common-dir>/entire-investigations/<run-id>/. Findings survive in the manifest (parallel to review's AggregateOutput); the file is gone but the content is recoverable via `entire investigate --findings` or a follow-up `entire investigate show <run-id>` command.

Paused/Cancelled runs keep the per-run dir untouched so --continue works and the user can read findings while a run is suspended. The --findings list prints "<captured in manifest>" instead of a file path when the run is cleaned up, so users know the content is still accessible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: fb498711831a

Print a saved investigation's summary + findings without needing the per-run directory (which is auto-cleaned on Quorum/Stalled by R3). Findings come from the manifest's embedded findings_content for terminal outcomes, or from the on-disk findings.md for paused/cancelled runs. Resolution accepts a full run id, a unique prefix, or no argument when there is exactly one manifest. Multiple-candidate cases print a candidate list rather than guessing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 343a9cb50805

The migration was scaffolded to move an "investigate" key from project settings (.entire/settings.json) to local settings (.entire/settings.local.json), prompting the user on every `entire investigate` invocation while the legacy key existed. Since the investigate feature has not shipped, no project anywhere has the legacy key, so the migration is purely vestigial overhead on the cold path.

Removed:

- cmd/entire/cli/investigate/migration.go and migration_test.go
- The maybePromptInvestigateSettingsMigration call in cmd.go's RunE (and the per-command PromptYN/canPrompt setup it required).
- TestNewCommand_RunsMigrationBeforeDispatch.
- settings.SaveLocalRaw: added in the prior commit only to give the investigate migration a typed write path; no other callers.

Deps.PromptYN and the realPromptYN wrapper stay. They are still used by the HEAD-soft-warn ("a checkpoint at HEAD already has HasInvestigation set; run again?").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 98dac7e8cb76

runGhExec formatted strings.Join(args, " ") directly into its wrapped error. When --issue-link carries a credential (https://user:TOKEN@github.com/...), gh failure surfaces TOKEN through stderr and .entire/logs/. ResolveIssueLink already redacted the URL for its own log paths via url.URL.Redacted(), but the runGhExec error path was missed. The earlier TestResolveIssueLink_RedactsCredentialsInErrors stubbed runGhFn entirely, bypassing runGhExec's formatting code. Add a redactArgsForError helper that maps each arg through url.URL.Redacted() when it parses as a URL with userinfo; non-URL args pass through. Cover both the helper and the leaf redactURLUserinfo with unit tests that exercise the production format path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 4d2544eb5a3c

…ed issue seeds

Two fixes from the Codex adversarial review of this branch.

1. runContinue never updated the saved manifest after a resumed run reached quorum/stalled. The fresh path routes through executeLoopAndCapture + writeRunManifest, but --continue used a thin executeLoop wrapper that discarded the LoopResult. After a paused -> quorum continuation, `entire investigate show / --findings / fix` saw the stale "paused" outcome with empty FindingsContent.

   Replace the wrapper call with executeLoopAndCapture + writeRunManifest. Reusing state.StartedAt keeps the manifest filename stable (<stamp>-<runID>.json) so the new write overwrites the paused record in place rather than creating a duplicate. WorktreePath isn't on RunState, so re-resolve via paths.WorktreeRoot; failure leaves the manifest written with an empty path rather than blocking the rewrite. Drop the now-unused executeLoop wrapper.

   New regression test: TestNewCommand_ContinueWritesTerminalManifest.

2. --issue-link feeds external GitHub content (issue body + comments) into agents that spawn with permission/sandbox bypass (claude-code --permission-mode bypassPermissions, codex --dangerously-bypass-approvals-and-sandbox). A malicious issue or comment can influence agent behaviour; the <untrusted> XML envelope is a prompt convention, not a real isolation boundary.

   Add a confirmation gate that fires right after resolveTopicAndSeed when issueSeed is non-empty. Interactive: prints the warning to stderr (including the source URL) and prompts y/N with default N; decline returns cleanly. Non-interactive: prints the warning and proceeds. CI / scripted callers passed --issue-link deliberately, and hard-blocking would break automation; the risk surfaces in operator-facing telemetry.

   New regression tests:
   - TestConfirmUntrustedIssueSeed_DeclinedExitsCleanly
   - TestConfirmUntrustedIssueSeed_AcceptedReturnsOK
   - TestConfirmUntrustedIssueSeed_PromptError

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 404f400adb7b

Conflict resolution: cmd/entire/cli/review_helpers.go::headCheckpointFlags. Keep this branch's triple-return shape (hasReview, hasInvestigation, info) — both review and investigate consume it via thin wrappers — but adopt main's new reader path: newCommittedCheckpointReader + checkpoint.ReadCommittedCheckpoint, which routes through the configured v1/v2 store mix. Main had collapsed the function to review-only and used the new reader; this branch had the triple-return but the older ResolveCommittedReaderForCheckpoint call. The merged shape preserves both improvements. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: e7b3e33f1058

…references

Mechanical comment sweep across the 30 new source files on this branch.

Removed:

- "Mirrors review/X" / "near-copy of" cross-package analogies
- "Migrated from review.go" / "Kept here because..." breadcrumbs
- File-header boilerplate that just named the file's role
- Narrative "we deliberately ..." / past-state references
- WHAT-restatements of well-named identifiers

Kept the load-bearing WHY: security constraints, sentinel-value docs, go-git workaround notes, idempotence + threading invariants, and the godoc-required package comment on env.go.

No code changes; build, vet, lint, and tests all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 463230cd5723

cursor

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Copilot

Pull request overview

Adds hidden labs support for entire investigate, a resumable multi-agent investigation loop, and refactors shared review/investigation utilities for prompts, provenance, git execution, TUI formatting, and fix-agent launching.

Changes:

Introduces investigation run state, manifests, loop execution, agent spawning, GitHub issue seeding, TUI/text progress, and fix/show/clean subcommands.
Adds investigation metadata propagation into session state, checkpoint summaries, status output, and explain JSON export.
Extracts shared utilities for accessible forms, display formatting, provenance env vars, git command execution, and fix-agent launching.

Reviewed changes

Copilot reviewed 88 out of 88 changed files in this pull request and generated 11 comments.

File	Description
`.golangci.yaml`	Updates lint allowances for new interfaces and investigate command code.
`cmd/entire/cli/agent/architecture_test.go`	Excludes shared spawn package from agent package discovery.
`cmd/entire/cli/agent/claudecode/spawner.go`	Adds Claude Code non-interactive spawner.
`cmd/entire/cli/agent/claudecode/spawner_test.go`	Tests Claude Code spawner command shape.
`cmd/entire/cli/agent/codex/spawner.go`	Adds Codex non-interactive spawner.
`cmd/entire/cli/agent/codex/spawner_test.go`	Tests Codex spawner command shape.
`cmd/entire/cli/agent/geminicli/spawner.go`	Adds Gemini CLI non-interactive spawner.
`cmd/entire/cli/agent/geminicli/spawner_test.go`	Tests Gemini CLI spawner command shape.
`cmd/entire/cli/agent/spawn/spawn.go`	Defines shared spawner interface.
`cmd/entire/cli/agentlaunch/launch.go`	Adds shared fix-agent launcher.
`cmd/entire/cli/agentlaunch/launch_test.go`	Tests provenance env stripping for fix-agent launches.
`cmd/entire/cli/attach.go`	Adds reusable review session tagging helper.
`cmd/entire/cli/checkpoint/checkpoint.go`	Adds investigation fields to checkpoint metadata types.
`cmd/entire/cli/checkpoint/committed.go`	Writes and preserves investigation metadata in committed checkpoints.
`cmd/entire/cli/explain_export.go`	Exposes investigation flags/fields in explain JSON export.
`cmd/entire/cli/explain_export_test.go`	Tests investigation JSON export behavior.
`cmd/entire/cli/gitexec/gitexec.go`	Adds shared git CLI execution helper.
`cmd/entire/cli/head_checkpoint_flags_test.go`	Tests HEAD review/investigation checkpoint flag helpers.
`cmd/entire/cli/investigate/bootstrap.go`	Builds initial investigation findings documents.
`cmd/entire/cli/investigate/clean.go`	Implements investigation cleanup.
`cmd/entire/cli/investigate/cmd.go`	Wires main investigate command flow and subcommands.
`cmd/entire/cli/investigate/cmd_internal_test.go`	Tests internal command helpers.
`cmd/entire/cli/investigate/env.go`	Defines investigate provenance env contract.
`cmd/entire/cli/investigate/env_test.go`	Tests investigate env handling.
`cmd/entire/cli/investigate/findings.go`	Implements local findings listing.
`cmd/entire/cli/investigate/findings_test.go`	Tests findings list output.
`cmd/entire/cli/investigate/fix.go`	Implements investigate fix prompt/launch flow.
`cmd/entire/cli/investigate/issuelink.go`	Resolves GitHub issue/PR URLs into investigation seed docs.
`cmd/entire/cli/investigate/loop.go`	Implements round-robin investigation loop.
`cmd/entire/cli/investigate/manifest.go`	Persists and resolves local investigation manifests.
`cmd/entire/cli/investigate/manifest_test.go`	Tests manifest persistence/resolution.
`cmd/entire/cli/investigate/multipicker.go`	Adds spawn-time multi-agent picker.
`cmd/entire/cli/investigate/multipicker_test.go`	Tests picker helper behavior.
`cmd/entire/cli/investigate/picker.go`	Adds first-run investigate config picker.
`cmd/entire/cli/investigate/picker_test.go`	Tests investigate config picker behavior.
`cmd/entire/cli/investigate/progress.go`	Adds progress sink interfaces and text sink.
`cmd/entire/cli/investigate/progress_test.go`	Tests text/null progress sinks.
`cmd/entire/cli/investigate/prompt.go`	Composes per-turn investigation prompts.
`cmd/entire/cli/investigate/prompt_yn.go`	Shares accessible y/N prompt.
`cmd/entire/cli/investigate/prompt_test.go`	Golden-tests investigation prompts.
`cmd/entire/cli/investigate/show.go`	Implements saved investigation display.
`cmd/entire/cli/investigate/show_test.go`	Tests show command logic.
`cmd/entire/cli/investigate/state.go`	Adds persisted run state store.
`cmd/entire/cli/investigate/testdata/prompt-first-round.txt`	Adds prompt golden file.
`cmd/entire/cli/investigate/testdata/prompt-mid-loop.txt`	Adds prompt golden file.
`cmd/entire/cli/investigate/testdata/prompt-with-always.txt`	Adds prompt golden file.
`cmd/entire/cli/investigate/tui_detail.go`	Adds TUI detail view rendering.
`cmd/entire/cli/investigate/tui_detail_test.go`	Tests TUI detail rendering.
`cmd/entire/cli/investigate/tui_sink.go`	Adds Bubble Tea progress sink.
`cmd/entire/cli/investigate/tui_text.go`	Adapts shared TUI text utilities.
`cmd/entire/cli/investigate_bridge.go`	Wires investigate deps from the CLI package.
`cmd/entire/cli/investigate_bridge_test.go`	Tests investigate root/bridge wiring.
`cmd/entire/cli/labs.go`	Lists investigate in labs overview.
`cmd/entire/cli/lifecycle.go`	Adopts investigate provenance env into session state.
`cmd/entire/cli/provenance/env.go`	Centralizes review/investigate env var names.
`cmd/entire/cli/review/cmd.go`	Uses shared gitexec HEAD helper.
`cmd/entire/cli/review/env.go`	Aliases review env names through provenance package.
`cmd/entire/cli/review/fix.go`	Uses shared fix-agent launcher.
`cmd/entire/cli/review/picker.go`	Uses shared accessible form helper.
`cmd/entire/cli/review/scope.go`	Uses shared gitexec runner.
`cmd/entire/cli/review/synthesis_sink.go`	Uses shared y/N prompt helper.
`cmd/entire/cli/review/tui_model.go`	Uses shared duration formatting.
`cmd/entire/cli/review/tui_text.go`	Uses shared TUI text helpers.
`cmd/entire/cli/review_helpers.go`	Adds shared HEAD checkpoint flag resolution.
`cmd/entire/cli/root.go`	Registers hidden investigate command.
`cmd/entire/cli/session/state.go`	Adds investigate session kind and fields.
`cmd/entire/cli/session/state_test.go`	Tests investigate session state serialization.
`cmd/entire/cli/settings/settings.go`	Adds investigate settings schema/merge support.
`cmd/entire/cli/settings/settings_test.go`	Tests investigate settings behavior.
`cmd/entire/cli/status.go`	Displays investigation status for HEAD checkpoints.
`cmd/entire/cli/status_test.go`	Tests investigation status output.
`cmd/entire/cli/strategy/manual_commit_condensation.go`	Propagates investigate metadata during condensation.
`cmd/entire/cli/strategy/manual_commit_condensation_test.go`	Tests condensation of investigation metadata.
`cmd/entire/cli/tuiutil/display.go`	Adds shared display-width and duration helpers.
`cmd/entire/cli/uiform/uiform.go`	Adds shared accessible huh form/prompt helpers.
`cmd/entire/cli/utils.go`	Delegates form/accessibility helpers to uiform.

Comments suppressed due to low confidence (1)

cmd/entire/cli/investigate/cmd.go:402

This interactive warning also echoes the raw issue URL, which can contain userinfo credentials. Use the redacted URL form for operator-facing output so tokens embedded in --issue-link are not leaked to the terminal or captured logs.

	fmt.Fprintln(cmd.ErrOrStderr(), warning)
	fmt.Fprintf(cmd.ErrOrStderr(), "Source: %s\n", issueLink)
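
One way to get a redacted form for operator-facing output is to replace the userinfo component before printing. The sketch below is an illustration only: `redactUserinfo` and the `REDACTED` placeholder are hypothetical names, not taken from this PR. It replaces the whole userinfo rather than calling `(*url.URL).Redacted()`, because that method masks only the password, so a token placed in the username position would still be printed.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactUserinfo is a hypothetical helper: it replaces any userinfo
// (user or user:password) in raw with a fixed placeholder. Input that
// does not parse as a URL is replaced wholesale so a malformed value
// is never echoed verbatim.
func redactUserinfo(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "<unparseable url>"
	}
	if u.User != nil {
		u.User = url.User("REDACTED")
	}
	return u.String()
}

func main() {
	// The token below is a made-up example value.
	fmt.Println(redactUserinfo("https://ghp_secret@github.com/entireio/cli/issues/1"))
}
```

With a helper like this, the `Source:` line would print the helper's result instead of the raw `issueLink`.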

The round, turn, and prompt coordinates on investigation sessions were audit-only with no consumers outside their own round-trip tests; the loop still tracks round/turn internally to render "Round X of Y" prompts but nothing downstream needs the persisted copies.

Drop them from explain export, CommittedMetadata, session.State, the env-var contract, and the adoption + condensation paths. Only investigate_run_id and investigate_topic survive, enough to attribute a checkpoint to a run and display the topic.

Also drop AttachSession + AttachOptions from attach.go: dead code left behind when `entire investigate attach` was removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: aedacb2a168e

TestNullProgressSink_ImplementsInterface ended with a strings.HasPrefix(string(OutcomeQuorum), "qu") sanity check that could never fail. Replace the runtime interface assertion with a package-level var _ ProgressSink = nullProgressSink{} declaration (compile-time guard) and rename the function to TestNullProgressSink_NoPanic so it honestly describes what it verifies. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6c5e94c73c6b
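
The compile-time guard described above can be sketched as follows. The single-method `ProgressSink` here is a stand-in chosen for illustration; the real interface in `investigate/progress.go` may have different methods.

```go
package main

import "fmt"

// ProgressSink is a stand-in with one hypothetical method; the real
// interface may differ.
type ProgressSink interface {
	TurnStarted(agent string)
}

// nullProgressSink discards every event.
type nullProgressSink struct{}

func (nullProgressSink) TurnStarted(string) {}

// Compile-time guard: the package stops building if nullProgressSink
// ever stops satisfying ProgressSink, so no runtime assertion is needed.
var _ ProgressSink = nullProgressSink{}

func main() {
	var s ProgressSink = nullProgressSink{}
	s.TurnStarted("claude-code") // must not panic
	fmt.Println("ok")
}
```

Because the guard is checked by the compiler, the remaining test only has to confirm that calling the sink does not panic, which is what the renamed test claims.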

Correctness:
- clean.go: delete run dir before manifest so a failed cleanup leaves a recoverable manifest breadcrumb
- show.go / fix.go: reject relative FindingsDoc at read time; commit manifest docstring to absolute paths only
- loop.go: bump state-save failure to Warn; fileFingerprint now uses size+sha256 to catch sub-second same-length edits; classifyRunErr distinguishes spawn vs non-zero exit; remove redundant slogString wrappers
- state.go: warn instead of swallowing unreadable state.json in List
- tui_sink.go: Start(ctx) ctx-watcher pushes tea.Quit on cancel so Wait unblocks on early loop return; flip defer order in cmd.go so cancelTUI fires before Wait
- tui_model.go: preserve rowStatusQueued for agents that never ran; Ctrl+C cancels from detail mode too (matches footer)
- manifest.go: tighten file mode 0o644 → 0o600 (matches state.go)

Security:
- fix.go: wrap investigation prompt and findings body in <untrusted> envelopes with defanged close-tags so prior-agent-ingested untrusted seed content cannot inject instructions into the fix prompt
- review/env.go: AppendReviewEnv now strips both ENTIRE_REVIEW_* AND ENTIRE_INVESTIGATE_* (symmetric to AppendInvestigateEnv)
- lifecycle.go: doc note on the env+agent+SHA adoption trust model

Layering:
- provenance/env.go: own IsValidRunID (uses checkpoint/id.Pattern)
- lifecycle.go: drop investigate import; use provenance.* directly

Tests:
- new TestAdoptInvestigateEnv_TagsSessionViaHandleLifecycleTurnStart mirrors the review side
- new TestLaunchFixAgent_EmptyEnvFallback_StripsHostProvenance covers the cmd.Env==nil → os.Environ() branch
- new tui_sink_test.go covers the ctx-cancel-unblocks-Wait contract
- updated fix/loop/tui_model tests to match new behavior

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b4aa8d29fb64

CI lint:
- escape U+200B literals (staticcheck ST1018)

Real bugs (cursor[bot] + Copilot review):
- loop: ctx-cancel mid-cmd.Run() now classifies as OutcomeCancelled instead of being counted as a turn failure and tripping OutcomePaused after two consecutive cancels (#1)
- loop: save state before returning OutcomePaused so --continue resumes from a snapshot that includes the failing turn (#11)
- investigate fix: wrap context.Canceled as SilentError so Ctrl+C during the fix session doesn't print a cobra usage banner (#2)
- cmd: redact URL userinfo on the issue-link Source: line in both interactive and non-interactive paths (#5)
- issuelink: redact URL userinfo across gh stderr (not just argv) so a token in --issue-link cannot leak via the error path (#10)
- cmd: outcome-aware footer: "Investigation ended" + resume hint for paused/cancelled, "Investigation complete" + fix hint only for Quorum/Stalled (#6)
- cmd: validate maxTurns/quorum bounds after settings/flag merge so a hand-edited negative max_turns or oversized quorum errors cleanly instead of silently stalling (#7)
- issuelink: tolerate GitHub URL trailing segments (/pull/123/files, trailing slash); the regex now anchors prefix and ignores tail (#13)
- picker: don't print "Saved investigate config" before persistence; moved to the caller after SaveLocal succeeds (#14)
- picker: guard pickerFormOverride with atomic.Pointer so parallel tests that swap the override don't race (#12)

Docs:
- settings/loop: fix stale "0 → 3" max_turns doc; default is 2 (#8/#9)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b0135abeee0d

…sume hints

`entire investigate --findings` previously printed the literal string `<captured in manifest>` for terminal-outcome runs whose findings live inside the manifest JSON. That's a placeholder, not a path; viewers had no obvious way to actually read the findings without knowing about `entire investigate show`.

New listing format:

    <run-id> · <topic> · <agents> · <when>
      view:   entire investigate show <run-id>        (every row, always)
      fix:    entire investigate fix <run-id>         (terminal outcomes only)
      resume: entire investigate --continue <id>      (paused/cancelled)
      path:   <findings.md path>                      (only when on-disk file still exists)

The `view:` line points at the show subcommand, which works regardless of where findings live; `fix` is only suggested for terminal outcomes since paused/cancelled runs have incomplete findings; the on-disk path is only printed when it points at an extant file (terminal outcomes auto-clean the per-run dir, so the prior code's stale path was already suppressed there).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 23da2a4fc345
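
The per-row hint rules above could be expressed along these lines. This is a sketch under assumptions: the `run` struct, the `hints` function, and the outcome strings (`quorum`, `stalled`, `paused`, `cancelled`) are illustrative and may not match the real manifest types.

```go
package main

import (
	"fmt"
	"os"
)

// run is a hypothetical, trimmed-down view of a manifest entry.
type run struct {
	ID           string
	Outcome      string // assumed values: quorum, stalled, paused, cancelled
	FindingsPath string
}

// hints returns the action lines to print under a listing row.
func hints(r run) []string {
	// view is always offered: show works wherever the findings live.
	lines := []string{"view:   entire investigate show " + r.ID}
	switch r.Outcome {
	case "quorum", "stalled": // terminal outcomes have complete findings
		lines = append(lines, "fix:    entire investigate fix "+r.ID)
	case "paused", "cancelled": // incomplete runs can be resumed
		lines = append(lines, "resume: entire investigate --continue "+r.ID)
	}
	// Only print a path that still points at a file on disk.
	if r.FindingsPath != "" {
		if _, err := os.Stat(r.FindingsPath); err == nil {
			lines = append(lines, "path:   "+r.FindingsPath)
		}
	}
	return lines
}

func main() {
	for _, l := range hints(run{ID: "abc123", Outcome: "paused"}) {
		fmt.Println(l)
	}
}
```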

Resolves conflict in cmd/entire/cli/review_helpers.go: keeps the branch's `headCheckpointFlags` 3-tuple return (HasReview, HasInvestigation, info) needed by the investigate side, while switching to main's new checkpoint.NewCommittedReader(ctx, repo, CommittedReaderOptions{}) signature so the v1/v2 read selection is handled inside the checkpoint package. Also fixes two test sites that drifted from main's NewV2GitStore signature (now takes just *git.Repository, no remote-name argument): head_checkpoint_flags_test.go and status_test.go. go.mod: promotes github.com/atotto/clipboard to a direct dependency per `go mod tidy` (it's used directly in this branch's code). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 064894d05fa7

Move headCheckpointFlags, headHasReviewCheckpoint, and headHasInvestigateCheckpoint out of review_helpers.go (an import-cycle grab-bag) into head_checkpoint_flags.go so the existing head_checkpoint_flags_test.go pairs with a matching source file per Go convention. Pure move, no logic change: the functions stay in package cli (checkpoint access can't live in the review/ subpackage without cycling) and remain cross-feature — used by status plus the review and investigate re-run guards. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Entire-Checkpoint: f31d2ea541b6

The file-level //nolint:ireturn,wrapcheck directive listed ireturn, but ireturn never fires in this file, so nolintlint flagged the directive as unused and CI failed. Keep wrapcheck (still needed for the osfs passthrough methods) and drop ireturn. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Entire-Checkpoint: bdb7feab3ec4

Addresses four security findings on the path that feeds attacker-controlled GitHub content into agents launched with permission/sandbox bypass.

1. Sanitize untrusted single-line metadata. Issue title, author login, and label names render outside the <untrusted> envelope (the title as a top-level heading), so an injected newline + "# SYSTEM:" could land as document structure. New sanitizeInline collapses control chars/newlines and neutralizes a leading markdown control char. Applied to title, author, labels, and comment authors.

2. Defang </untrusted> case-insensitively with whitespace tolerance. The exact-string match missed "</untrusted >", "</UNTRUSTED>", "</untrusted\t>", which an LLM may still read as a real closing tag. Replaced with a regex; the shared writeUntrustedBlock chokepoint also covers fix.go's re-wrap.

3. Refuse non-interactive --issue-link by default. The CI path (remote content + auto-approving agent + no human gate) was the most dangerous. Added --allow-untrusted-seed; without it a non-interactive run now refuses instead of silently proceeding.

4. Validate run IDs at the source. A planted manifest with run_id "../../.." flowed through clean → RunDir → os.RemoveAll. List() now skips manifests whose run_id fails validateRunID and ResolveByRunID ignores invalid entries, so no unvalidated id can reach RunDir. RunDir's precondition is documented.

Tests cover sanitizeInline, adversarial title/author/label rendering, close-tag variant defang, the non-interactive refuse/opt-in branches, and the manifest run-ID filtering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Entire-Checkpoint: a2ace8be65d1
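
The close-tag defang in item 2 can be sketched with a case-insensitive, whitespace-tolerant pattern. Both the exact pattern and the `[/untrusted]` replacement are assumptions made for this illustration; the PR does not show the regex it uses or what it substitutes.

```go
package main

import (
	"fmt"
	"regexp"
)

// closeTag matches </untrusted> in any letter case, tolerating
// whitespace (spaces, tabs, newlines) around the slash and before '>'.
// The pattern is an illustrative guess, not the PR's actual regex.
var closeTag = regexp.MustCompile(`(?i)<\s*/\s*untrusted\s*>`)

// defang rewrites anything that could be read as a closing envelope tag
// into an inert bracketed form (the replacement text is hypothetical).
func defang(s string) string {
	return closeTag.ReplaceAllString(s, "[/untrusted]")
}

func main() {
	for _, in := range []string{"</untrusted >", "</UNTRUSTED>", "</untrusted\t>"} {
		fmt.Printf("%q -> %q\n", in, defang(in))
	}
}
```

Routing every envelope write through one function that applies this rewrite matches the "shared chokepoint" idea in the commit message: a variant missed in one caller cannot slip through another.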

…ue-link test TestInvestigate_IssueLink_ResolvesViaFakeGh runs `entire investigate --issue-link` via execx.NonInteractive (no TTY). The strict default added in 10e58ee now refuses such runs without --allow-untrusted-seed, so the test exited 1. The test consciously opts in — it is exercising the issue-link resolution + bootstrap path, not the refusal gate (covered by unit tests). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Entire-Checkpoint: dd4053c251ee

Resolved conflict in cmd/entire/cli/gitrepo/alternates_fs.go: both sides dropped the unused ireturn nolint directive; kept main's more detailed wrapcheck rationale (functionally identical directive). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: a64caa818a4f

alishakawaguchi and others added 30 commits May 8, 2026 23:00

Merge branch 'main' into entire-labs-investigate

38cd769

investigate: update first-run banner to mention local settings

1b31a6c

Reflects the storage change in fix(investigate): store config in .entire/settings.local.json. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 4e825ed88af2

alishakawaguchi and others added 5 commits May 19, 2026 12:18

Copilot AI review requested due to automatic review settings May 19, 2026 23:16

Copilot started reviewing on behalf of alishakawaguchi May 19, 2026 23:17

cursor Bot reviewed May 19, 2026


Comment thread cmd/entire/cli/investigate/loop.go

Comment thread cmd/entire/cli/investigate/cmd.go

Comment thread cmd/entire/cli/investigate/loop.go

Copilot AI reviewed May 19, 2026


alishakawaguchi and others added 6 commits May 19, 2026 17:31

alishakawaguchi self-assigned this May 20, 2026

alishakawaguchi marked this pull request as ready for review May 20, 2026 21:07

alishakawaguchi requested a review from a team as a code owner May 20, 2026 21:07

alishakawaguchi and others added 6 commits May 26, 2026 09:19

Merge branch 'main' into entire-labs-investigate

93751ea

Merge branch 'main' into entire-labs-investigate

1c8daeb

Soph previously approved these changes May 27, 2026


alishakawaguchi dismissed Soph’s stale review via dabe501 May 27, 2026 21:50

evisdren approved these changes May 27, 2026


alishakawaguchi merged commit 42f10f1 into main May 27, 2026
9 checks passed

alishakawaguchi deleted the entire-labs-investigate branch May 27, 2026 23:04

alishakawaguchi merged 76 commits into main from entire-labs-investigate

4 participants
